Distributed Join Processing Between Streaming and Stored Big Data Under the Micro-Batch Model
Similar Resources
Beyond Batch Processing: Towards Real-Time and Streaming Big Data
Today, big data is generated from many sources, and there is huge demand for storing, managing, processing, and querying it. The MapReduce model and its open-source implementation, Hadoop, have proven themselves as the de facto solution for big data processing. Hadoop is inherently designed for batch, high-throughput processing jobs. Although Hadoop is very suitable for batch ...
k-Means for Streaming and Distributed Big Sparse Data
We provide the first streaming algorithm for computing a provable approximation to the k-means of sparse Big Data. Here, sparse Big Data is a set of n vectors in R^d, where each vector has O(1) non-zero entries, and d ≥ n. Examples include the adjacency matrix of a graph, and web-link, social-network, document-term, or image-feature matrices. Our streaming algorithm stores at most log n · k input points in memo...
Intelligent Distributed Processing Methods for Big Data
Motivation: Today, “Big Data” is a new information overload problem in many different areas. Such areas include health care (e.g., medical records, bioinformatics), e-sciences (e.g., physics, chemistry, and geology), and social sciences (e.g., politics). Thus, as we have various types of feasible data from a number of available sources, it is becoming increasingly difficult to efficient...
Big data for Natural Language Processing: A streaming approach
Requirements for computational power have grown dramatically in recent years. This is also the case in many language processing tasks, due to the overwhelming and ever-increasing amount of textual information that must be processed in a reasonable time frame. This scenario has led to a paradigm shift in the computing architectures and large-scale data processing strategies used in the Natural La...
Journal
Journal title: IEEE Access
Year: 2019
ISSN: 2169-3536
DOI: 10.1109/access.2019.2904730